- Introduction & Motivating Example
- Multi-State AD-PCA
- Simulation Study
- Case Study Results
- Software Package for R
- Summary and References
PCA is non-optimal because of the autocorrelation and non-linearity / non-stationarity of the data. Some alternatives are:
We combine Adaptive, Dynamic, and Multi-State modifications to PCA.
These statistics are estimated non-parametrically, as discussed by Kazor et al. (2016).
We follow Kazor et al. (2016) and continue updating the original design of Dong and McAvoy (1996). Thus
Change states hourly in sequence, where the three process states are:
The process-specific scaling and projection matrices are: \[ \textbf{P}_2 = \begin{bmatrix} 0 & 0.50 & -0.87 \\ 0 & 0.87 & 0.50 \\ 1 & 0 & 0 \end{bmatrix} \qquad \boldsymbol\Lambda_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0.5 & 0 \\ 0 & 0 & 2 \end{bmatrix} \] \[ \textbf{P}_3 = \begin{bmatrix} 0 & 0.87 & -0.50 \\ -1 & 0 & 0 \\ 0 & 0.50 & 0.87 \end{bmatrix} \qquad \boldsymbol\Lambda_3 = \begin{bmatrix} 0.25 & 0 & 0 \\ 0 & 0.1 & 0 \\ 0 & 0 & 0.75 \end{bmatrix} \] \(\textbf{P}_2\) and \(\textbf{P}_3\) rotate the features so that the states are at right angles in at least one dimension. \(\boldsymbol\Lambda_2\) and \(\boldsymbol\Lambda_3\) inflate or deflate the feature variances along each principal component.
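As a quick numerical check (not part of the paper's code), we can verify that the columns of \(\textbf{P}_2\) are approximately orthonormal, so it acts as a rotation, and then apply the state-2 transformation to a latent observation. A minimal numpy sketch:

```python
import numpy as np

# State-2 projection matrix (entries rounded to two decimals in the slides,
# so orthonormality only holds approximately)
P2 = np.array([[0.00, 0.50, -0.87],
               [0.00, 0.87,  0.50],
               [1.00, 0.00,  0.00]])
# State-2 variance scaling along each principal component
L2 = np.diag([1.0, 0.5, 2.0])

# Columns are unit-length and mutually perpendicular (up to rounding),
# so P2 rotates the latent features without distorting distances.
print(np.round(P2.T @ P2, 2))

# Transform a latent observation t into the state-2 feature space
t = np.ones(3)
x = P2 @ L2 @ t
```

The same check applies to \(\textbf{P}_3\); the rounding of \(\cos 60^\circ\) and \(\sin 60^\circ\) to 0.50 and 0.87 is why the product is only approximately the identity.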
We set the threshold to \(\alpha = 0.001\) and repeat the following 1000 times:
An observation is flagged as suspicious if the monitoring statistic is beyond the calculated non-parametric threshold. An observation will trigger an alarm if it is the \(k^{th}\) observation in a row to be flagged. For this simulation study, we triggered an alarm at the third consecutive flag.
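The flag-and-alarm rule above fits in a few lines. This is an illustrative Python sketch (the package implements the rule in R), using an empirical quantile as the non-parametric threshold:

```python
import numpy as np

def nonparametric_threshold(noc_stats, alpha=0.001):
    """Empirical (1 - alpha) quantile of a monitoring statistic under NOC."""
    return np.quantile(noc_stats, 1 - alpha)

def alarms(stats, threshold, k=3):
    """Flag each statistic beyond the threshold; raise an alarm at any
    observation that is the k-th (or later) consecutive flag."""
    run = 0
    out = []
    for s in stats:
        run = run + 1 if s > threshold else 0
        out.append(run >= k)
    return out

# Toy example: threshold 10, alarm on the 3rd consecutive flag.
# The fourth value breaks the run, so no alarm is raised after it.
flags_demo = alarms([12, 11, 13, 2, 15], threshold=10, k=3)
```

Requiring \(k\) consecutive flags trades a slight detection delay for robustness against isolated threshold exceedances.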
For each statistic under AD- and MSAD-PCA, measure:
Induce one of the following faults at \(s = 8,500\).
| Feature Affected | Shift Fault | Drift Fault | Latent or Error Fault |
|---|---|---|---|
| All Features Equally | Fault 1A: \((x, y, z)\) | Fault 2A: \((x, y, z)\) | Fault 3A: \((x, y, z)\) |
| Each Feature Differently | Fault 1B: \((x)\) | Fault 2B: \((y, z)\) | Fault 3B: \((z)\) |
| Each State Differently | Fault 1C: \((x, z) \in \mathcal{S}_3\) | Fault 2C: \(y \in \mathcal{S}_2\) | Fault 3C: \(y \in \mathcal{S}_2\) |
A single draw from the multivariate process under NOC. The vertical black line (21:40 on 2 December) marks the time at which the fault would be induced.
Add 2 to feature \(X\) only, in all states. This shift in feature \(X\) will infect features \(Y\) and \(Z\) through \(\textbf{P}_2\boldsymbol\Lambda_2\) and \(\textbf{P}_3\boldsymbol\Lambda_3\).
Drift feature \(Y\) by a maximum \(-1.5\) only in state \(\mathcal{S}_2\). This fault is induced after state projections.
Drift the latent variable \(t\) by a maximum of \(+6\) for all features in all states.
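A rough sketch of how the shift and drift faults above modify a series, assuming the fault begins at observation \(s = 8{,}500\); the masking that restricts a fault to state \(\mathcal{S}_2\) hours is omitted here for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 10_000, 8_500            # series length and fault start index
x = rng.normal(size=n)          # stand-ins for two observed features
y = rng.normal(size=n)

# Shift fault: add a constant +2 to feature x from time s onward
x_fault = x.copy()
x_fault[s:] += 2.0

# Drift fault: drift feature y linearly, reaching -1.5 at the series end
drift = -1.5 * np.arange(n - s) / (n - s - 1)
y_fault = y.copy()
y_fault[s:] += drift

# The latent fault would instead drift the latent variable t by up to +6
# before projecting through P and Lambda, so it reaches all features.
```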
Under the Multi-State model:
Under the Single-State model, the pairs of AD-PCA and MSAD-PCA process monitoring statistics perform similarly, so we do not expect significant loss of power when using the MSAD-PCA procedure to detect faults.
We increased the number of consecutive flags required to trigger an alarm from 3 to 5 for the real case because of its strong serial autocorrelation. Note that these false alarm rates are all greater than or equal to the set threshold level of \(\alpha = 0.1\%\).
| | MSAD \(T^2\) | MSAD SPE | AD \(T^2\) | AD SPE |
|---|---|---|---|---|
| False Alarm Rate | 0.1% | 0.3% | 0.3% | 0.3% |
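The reported false alarm rate is simply the share of NOC observations that nonetheless trigger an alarm; for example (illustrative counts, not the study's data):

```python
# False alarm rate = alarms raised under normal operating conditions,
# divided by the number of NOC observations monitored.
def false_alarm_rate(alarm_indicators):
    return sum(alarm_indicators) / len(alarm_indicators)

# 3 alarms out of 1000 NOC observations corresponds to a 0.3% rate
rate = false_alarm_rate([True] * 3 + [False] * 997)
```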
The R package `mvMonitoring` provides four main functions:

- `mspProcessData()` generates random draws from a serially autocorrelated and nonstationary multi-state (or single-state) multivariate process.
- `mspTrain()` trains the projection matrix and non-parametric fault detection thresholds.
- `mspMonitor()` assesses incoming process observations and classifies them as normal or abnormal.
- `mspWarning()` keeps a running tally of abnormal observations and raises an alarm if necessary.

Baggiani, F., Marsili-Libelli, S., 2009. Real-time fault detection and isolation in biological wastewater treatment plants. Water Sci. Technol. 60, 2949–2961. doi:10.2166/wst.2009.723
Chouaib, C., Mohamed-Faouzi, H., Messaoud, D., 2013. Adaptive kernel principal component analysis for nonlinear dynamic process monitoring, in: Control Conference (ASCC), 2013 9th Asian. pp. 1–6. doi:10.1109/ASCC.2013.6606291
Dong, D., McAvoy, T.J., 1996. Batch tracking via nonlinear principal component analysis. AIChE J. 42, 2199–2208. doi:10.1002/aic.690420810
Ge, Z., Yang, C., Song, Z., 2009. Improved kernel PCA-based monitoring approach for nonlinear processes. Chem. Eng. Sci. 64, 2245–2255. doi:10.1016/j.ces.2009.01.050
Kazor, K., Holloway, R.W., Cath, T.Y., Hering, A.S., 2016. Comparison of linear and nonlinear dimension reduction techniques for automated process monitoring of a decentralized wastewater treatment facility. Stoch. Environ. Res. Risk Assess. 30, 1527–1544. doi:10.1007/s00477-016-1246-2
Kresta, J.V., MacGregor, J.F., Marlin, T.E., 1991. Multivariate statistical monitoring of process operating performance. Can. J. Chem. Eng. 69, 35–47. doi:10.1002/cjce.5450690105
Miao, A., Song, Z., Ge, Z., Zhou, L., Wen, Q., 2013. Nonlinear fault detection based on locally linear embedding. J. Control Theory Appl. 11, 615–622. doi:10.1007/s11768-013-2102-2
Sanchez-Fernandez, A., Fuente, M.J., Sainz-Palmero, G.I., 2015. Fault detection in wastewater treatment plants using distributed PCA methods, in: 2015 IEEE 20th Conference on Emerging Technologies Factory Automation (ETFA). pp. 1–7. doi:10.1109/ETFA.2015.7301504
Tenenbaum, J.B., Silva, V. de, Langford, J.C., 2000. A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323. doi:10.1126/science.290.5500.2319
Weinberger, K.Q., Saul, L.K., 2006. Unsupervised learning of image manifolds by semidefinite programming. Int. J. Comput. Vision 70, 77–90. doi:10.1007/s11263-005-4939-z
Wise, B.M., Veltkamp, D.J., Davis, B., Ricker, N.L., Kowalski, B.R., 1988. Principal components analysis for monitoring the West Valley Liquid-Fed Ceramic Melter, in: Management of Radioactive Wastes, and Non-Radioactive Wastes from Nuclear Facilities. Tucson, AZ.